Lazy Sparse Stochastic Gradient Descent for Regularized Multinomial Logistic Regression

Author

  • Bob Carpenter
Abstract

Stochastic gradient descent efficiently estimates maximum likelihood logistic regression coefficients from sparse input data. Regularization with respect to a prior coefficient distribution destroys the sparsity of the gradient evaluated at a single example. Sparsity is restored by lazily shrinking a coefficient along the cumulative gradient of the prior just before the coefficient is needed.

1 Multinomial Logistic Model

A multinomial logistic model classifies d-dimensional real-valued input vectors x ∈ R^d into one of k outcomes c ∈ {0, . . . , k − 1} using k − 1 parameter vectors β_0, . . . , β_{k−2} ∈ R^d:

    p(c | x, β) = exp(β_c · x) / Z_x   if c < k − 1
    p(c | x, β) = 1 / Z_x              if c = k − 1        (1)

where Z_x normalizes the distribution and the linear predictor is the inner product β_c · x = ∑_{i < d} β_{c,i} x_i.
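To make the lazy-update idea concrete, here is a minimal Python sketch of equation (1) together with a stochastic gradient step that touches only the non-zero features of an example, assuming a Gaussian (L2) prior and a fixed learning rate. The class name, data structures, and per-step multiplicative shrinkage schedule are illustrative choices, not details taken from the paper.

```python
import math
from collections import defaultdict

class LazySparseLogReg:
    """Sketch of lazy sparse SGD for multinomial logistic regression with a
    Gaussian (L2) prior; the prior's shrinkage is applied to a coefficient only
    when that coefficient is actually needed (illustrative, not the paper's code)."""

    def __init__(self, k, eta=0.1, sigma2=1.0):
        self.k = k                        # number of outcome categories
        self.eta = eta                    # learning rate (assumed constant here)
        self.sigma2 = sigma2              # prior variance
        self.beta = [defaultdict(float) for _ in range(k - 1)]       # sparse coefficients
        self.last_seen = [defaultdict(int) for _ in range(k - 1)]    # step of last shrink
        self.t = 0                        # global step counter

    def _catch_up(self, c, i):
        """Apply the accumulated prior shrinkage to beta[c][i] before using it."""
        skipped = self.t - self.last_seen[c][i]
        if skipped > 0:
            # one multiplicative L2 shrink per skipped step: beta *= (1 - eta / sigma^2)
            self.beta[c][i] *= (1.0 - self.eta / self.sigma2) ** skipped
            self.last_seen[c][i] = self.t

    def predict(self, x):
        """p(c | x, beta) for a sparse input x given as {feature index: value}."""
        scores = []
        for c in range(self.k - 1):
            for i in x:
                self._catch_up(c, i)
            scores.append(sum(self.beta[c][i] * v for i, v in x.items()))
        scores.append(0.0)                # reference outcome k-1 has linear predictor 0
        z = sum(math.exp(s) for s in scores)
        return [math.exp(s) / z for s in scores]

    def update(self, x, y):
        """One SGD step on example (x, y); only non-zero features are touched."""
        self.t += 1
        probs = self.predict(x)           # predict() brings touched coefficients current
        for c in range(self.k - 1):
            err = (1.0 if y == c else 0.0) - probs[c]
            for i, v in x.items():
                self.beta[c][i] += self.eta * err * v
                self.last_seen[c][i] = self.t
```

The point of the deferral is that the prior's gradient is dense: applying it eagerly at every step would touch every coefficient, whereas catching a coefficient up only when a non-zero feature reads it keeps the per-example cost proportional to the number of non-zero features.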

Similar resources

Asynchronous Stochastic Proximal Optimization Algorithms with Variance Reduction

Regularized empirical risk minimization (R-ERM) is an important branch of machine learning, since it constrains the capacity of the hypothesis space and guarantees the generalization ability of the learning algorithm. Two classic proximal optimization algorithms, i.e., proximal stochastic gradient descent (ProxSGD) and proximal stochastic coordinate descent (ProxSCD) have been widely used to so...
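For reference, a plain synchronous proximal stochastic gradient step for an ℓ1-regularized R-ERM objective looks like the sketch below; this is a generic illustration of ProxSGD only, not the asynchronous, variance-reduced algorithms the paper studies, and the function names and constant step size are assumptions.

```python
import numpy as np

def soft_threshold(w, tau):
    """Proximal operator of tau * ||w||_1 (coordinate-wise soft-thresholding)."""
    return np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)

def prox_sgd_l1(grad_fi, w0, n, eta=0.01, lam=0.1, epochs=10, seed=0):
    """Generic ProxSGD for (1/n) * sum_i f_i(w) + lam * ||w||_1.

    grad_fi(w, i) should return the gradient of the i-th loss term at w."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(epochs):
        for i in rng.permutation(n):
            # stochastic gradient step on one loss term, then the l1 proximal map
            w = soft_threshold(w - eta * grad_fi(w, i), eta * lam)
    return w
```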

Full text

Fast Implementation of ℓ1 Regularized Learning Algorithms Using Gradient Descent Methods

With the advent of high-throughput technologies, l1 regularized learning algorithms have attracted much attention recently. Dozens of algorithms have been proposed for fast implementation, using various advanced optimization techniques. In this paper, we demonstrate that l1 regularized learning problems can be easily solved by using gradient-descent techniques. The basic idea is to transform a ...

Full text

Efficient Elastic Net Regularization for Sparse Linear Models

We extend previous work on efficiently training linear models by applying stochastic updates to non-zero features only, lazily bringing weights current as needed. To date, only the closed form updates for the ℓ1, ℓ∞, and the rarely used ℓ2 norm have been described. We extend this work by showing the proper closed form updates for the popular ℓ2² and elastic net regularized models. We show a dyn...
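As a rough illustration of the kind of closed-form lazy update being described (my own hedged sketch under a constant step size, not the paper's derivation), an ℓ2² penalty lets k skipped regularization-only steps collapse into a single multiplication, and a single elastic-net proximal step combines soft-thresholding with that scaling:

```python
import math

def lazy_l2_catch_up(w, skipped, eta, lam):
    """Closed-form lazy catch-up for an l2^2 (ridge) penalty with constant step size:
    each regularization-only step scales a weight by (1 - eta*lam), so `skipped`
    such steps collapse into one multiplication."""
    return w * (1.0 - eta * lam) ** skipped

def elastic_net_prox_step(w, eta, lam1, lam2):
    """One proximal step for the elastic net penalty lam1*|w| + (lam2/2)*w^2:
    soft-threshold for the l1 part, then scale down for the l2^2 part."""
    w = math.copysign(max(abs(w) - eta * lam1, 0.0), w)
    return w / (1.0 + eta * lam2)
```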

Full text

A coordinate gradient descent method for ℓ1-regularized convex minimization

In applications such as signal processing and statistics, many problems involve finding sparse solutions to under-determined linear systems of equations. These problems can be formulated as structured nonsmooth optimization problems, i.e., the problem of minimizing ℓ1-regularized linear least squares. In this paper, we propose a block coordinate gradient descent method (abbreviated a...
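Since the abstract's continuation is cut off, the snippet below only sets up the same ℓ1-regularized least squares objective and solves it with plain cyclic coordinate descent and soft-thresholding; it is a standard textbook baseline for illustration, not the block coordinate gradient descent method the paper proposes, and the function name is my own.

```python
import numpy as np

def lasso_coordinate_descent(A, b, lam, n_iters=100):
    """Cyclic coordinate descent for  minimize_x 0.5*||A x - b||^2 + lam*||x||_1
    using the standard per-coordinate soft-thresholding update."""
    m, n = A.shape
    x = np.zeros(n)
    col_sq = (A ** 2).sum(axis=0)       # ||A_j||^2 for each column
    r = b - A @ x                       # current residual
    for _ in range(n_iters):
        for j in range(n):
            if col_sq[j] == 0.0:
                continue
            r += A[:, j] * x[j]         # remove coordinate j from the residual
            rho = A[:, j] @ r           # correlation of column j with partial residual
            x[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= A[:, j] * x[j]         # fold the updated coordinate back in
    return x
```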

Full text

Journal title:

Volume   Issue

Pages  -

Publication date: 2008